Abstract

Consider communication over the binary erasure channel BEC($\epsilon$) using random
low-density parity-check codes with finite blocklength $n$ from 'standard' ensembles.
We show that large error events are conveniently described within a scaling theory,
and explain how to estimate their effect heuristically. Among other quantities,
we consider the finite-length threshold $\epsilon^*(n)$, defined by requiring a block error
probability $P_B = 1/2$. For ensembles with minimum variable degree larger than
two, the following expression is argued to hold:

$$\epsilon^*(n) = \epsilon^* - \epsilon_1^* \, n^{-2/3} + \Theta(n^{-1})$$

with a calculable shift parameter $\epsilon_1^* > 0$.

1 Introduction

Assessing the performance of finite-blocklength iterative coding systems is an important
open issue in modern coding theory. A particular case of this task is the study
of low-density parity-check (LDPC) code ensembles used for communicating over
the binary erasure channel BEC($\epsilon$). A considerable effort has been devoted to this case,
in the hope of positive feedback on the general problem. Some
lessons can be drawn from the results obtained so far:

Approximate! While density evolution (DE) provides exact thresholds in the large
blocklength limit, there is little hope of computing the exact performance (bit or block error
rates $P_b$ and $P_B$) at finite blocklength $n$. For the BEC($\epsilon$), $P_b$ and $P_B$ are determined by
a set of recursions [3] whose evaluation has complexity $\Theta(n^{\kappa})$. However, the exponent $\kappa$
grows with the irregularity of the ensemble (more precisely, with the number of
probabilistically inequivalent types of node in the Tanner graph). Given the large degree of
irregularity necessary for approaching capacity, an exact calculation becomes prohibitive
already for moderate blocklengths. The situation is unlikely to be simpler for more general
channel models.

It is therefore crucial to develop approximate estimates of finite-length performance.

Small error events. Consider, for the sake of simplicity, communication over
BEC($\epsilon$) using an LDPC ensemble. Below the threshold $\epsilon^*$ for iterative decoding, the
typical size of error events (the number of erased bits after decoding) is of order 1.
Above threshold, the same size is of order $n$ (the bit error probability is finite). One can
regard the failure of iterative decoding at $\epsilon^*$ as due to the divergence (on the $\Theta(1)$ scale)
of the size of typical error events.

The probability of small error events is readily evaluated through the union bound.
For a code with minimum variable degree $l_{\min}$, the expected number of error events
involving $E$ bits is $\Theta(n^{E - \lceil E l_{\min}/2 \rceil} \epsilon^E)$, as long as $E$ is kept finite in the $n \to \infty$ limit.
If $l_{\min} > 2$, this quantity decreases by a factor $n$ (or more) each time $E$ increases by
one. This suggests that only very small error events have a non-negligible probability:
their contribution can be computed exactly and compares favorably with recursive
calculations or numerical simulations. Moreover, in the vast majority of elements from
the ensemble, such error events are strictly absent.

If $l_{\min} = 2$, the dominating error events involve only degree-2 variable nodes,
and have the topology of cycles in the Tanner graph. The number of such structures is
$\Theta(n^{2E})$: one can choose both the variables and the check nodes involved. Their probability
is $\Theta(\lambda_2^E \epsilon^E n^{-2E})$, where we denote by $\lambda_l$ the fraction of variable nodes having degree
$l$. Indeed, the variables must be erased (which explains the factor $\epsilon^E$); they must have
degree 2 (factor $\lambda_2^E$); and they must be connected to the previously chosen check nodes
(factor $n^{-2E}$). Unlike in the case $l_{\min} > 2$, the resulting typical size depends upon $\epsilon$.
Detailed calculations can be carried out for cycle ensembles: one finds $E_{\rm typ} \sim |\epsilon^* - \epsilon|^{-1}$ as
$\epsilon \uparrow \epsilon^*$. The divergence of the size of error events 'drives' the failure of iterative decoding above
$\epsilon^*$.

Large error events. Computing the probability of small (i.e. finite in the $n \to \infty$
limit) error events is a conceptually straightforward task. While in the case $l_{\min} > 2$ one
has to do the computation for just a few small structures, if degree-two nodes are present,
an infinite number of contributions must be summed over. In both cases, this approach
yields a simple and accurate description of the error probability in the noise regime for
which $P_B = \Theta(n^{1 - \lceil l_{\min}/2 \rceil})$. This is the so-called error-floor region.

What about the waterfall region? The description in terms of finite-size error events
cannot account for this regime. Take for instance the case $l_{\min} > 2$. As long as the error
size $E$ is finite, its probability decreases rapidly with $E$. The breakdown of iterative
decoding at $\epsilon^*$ can therefore be ascribed only to error events whose size diverges
with $n$. Analogously, for cycle ensembles the typical size of finite error events diverges at
$\epsilon^*$. This conclusion can be extended to general irregular ensembles: in order to describe
the waterfall region, large error events have to be taken into account.

This remark implies several theoretical problems. First of all, no enumeration of all
the configurations (by this we mean stopping sets in the BEC($\epsilon$) case, and any suitable
generalization for other channels) responsible for errors is possible in the waterfall regime.
In fact, it is likely that the number of relevant 'topologies' diverges with the blocklength.
Second, we cannot think of the set of wrongly decoded bits as the union of several small
'error events' which are probabilistically independent. In more practical terms: the
union bound is not a reliable tool in this regime.

Yet, as optimized ensembles approach capacity, controlling the waterfall region is of
utmost interest. We developed an approach to this problem for the BEC($\epsilon$) case, which
yields extremely satisfactory results. The methods are complementary to the stopping-set
analysis and build upon the description of iterative decoding by Luby et al.
Although the extension to general channel models is likely to require a considerable effort,
one can learn a general lesson about which kind of characterization can be hoped for in
the waterfall regime.

Consider, for the sake of simplicity, the case $l_{\min} > 2$. We find that there exist a
non-negative constant $\nu$ and some non-negative function $f(z)$ such that

$$\lim_{n \to \infty} P_B(n, \epsilon_n) = f(z), \qquad (1)$$

where the $n \to \infty$ limit is taken keeping $n^{1/\nu}(\epsilon^* - \epsilon_n) = z$ fixed. In other words, if one
plots $P_B(n, \epsilon)$ as a function of $z = n^{1/\nu}(\epsilon^* - \epsilon)$ then, for increasing $n$, these finite-length
curves are expected to converge to some function $f(z)$. The function $f(z)$ decreases
smoothly from 1 to 0 as its argument changes from $-\infty$ to $+\infty$. This means that all
finite-length curves are, to first order, scaled versions of some mother curve $f(z)$. It might
be helpful to think of the threshold $\epsilon^*$ as the zeroth-order term in a Taylor series. Then
the above scaling, if correct, represents the first-order term. In fact, one can even refine
the analysis to include higher-order terms and write

$$P_B(n, \epsilon) = f(z) + n^{-\omega} g(z) + o(n^{-\omega}), \qquad (2)$$

where $\omega$ is some positive real number and $g(z)$ is the second-order correction term.

For ensembles with $l_{\min} = 2$ (and in particular, for ensembles whose threshold is fixed
by the local stability condition) Eq. (1) must be properly generalized. We refer to Ref. [3]
for an example.

It is worth making a couple of remarks. First of all, the form (1), wherever it can
be argued to hold, allows a precise definition of what is meant by the "waterfall" region.
This is the interval of channel parameters $\epsilon^* - C_- n^{-1/\nu} < \epsilon < \epsilon^* + C_+ n^{-1/\nu}$
for some positive constants $C_-$ and $C_+$. Second, even in cases in which the function
$f(z)$ cannot be determined analytically, the statement (1) is highly informative, since it
reduces a two-variable function to a single-variable one. Moreover, in several cases, $f(z)$
can be efficiently given in terms of a few parameters. This opens the way to empirical
applications of the scaling form. An example is provided in Fig. 1.

Finite-length optimization of code ensembles is an issue of great practical relevance.
We think that the scaling description (1)-(2) may be an important step towards a
mathematically well-founded solution of this task.

A first numerical investigation of finite-size scaling for LDPC codes has been presented
previously. Earlier accounts of the present work have appeared elsewhere, and a complete
version is given in [3]. Related ideas were put forward by Lee and Blahut [6] and Zemor and Cohen [15].

2 Heuristic arguments 

In this paper we consider standard LDPC$(n, \lambda, \rho)$ ensembles. Here $n$ is the blocklength,
and $\lambda$ and $\rho$ denote the degree distributions of (respectively) variable and check nodes
from an edge perspective. For the sake of simplicity, we shall often refer to regular
ensembles with left degree $l$ and right degree $k$. The corresponding degree distributions
read $\lambda(x) = x^{l-1}$ and $\rho(x) = x^{k-1}$.

Figure 1: Scaling of $P_B(n, \sigma)$ for transmission over BAWGNC($\sigma$) and a quantized
version of belief propagation decoding implemented in hardware. The threshold for this
combination is $(E_b/N_0)^*_{\rm dB} \approx 1.19658$. The blocklengths $n$ are $n$ = 1000, 2000, 4000,
8000, 16000 and 32000, respectively. The solid curves represent the simulated ensemble
averages. The dashed curves are computed according to the refined scaling law (10)
with scaling parameters $\alpha = 0.8694$ and $\beta = 5.884$. These parameters were fitted to the
empirical data.

In order to analyze the iterative decoder, we adopt the point of view introduced by
Luby et al. According to this description, the algorithm proceeds as follows
(we assume the all-zeros codeword to be transmitted and describe the action of the
algorithm on the Tanner graph). Given the received message, the decoder deletes all
received variable nodes and their incident edges. In this way one arrives at a residual
graph. The decoder now proceeds in an iterative fashion. If the residual graph contains
no degree-one check nodes, the decoding process stops. Otherwise, the decoder randomly
chooses one such degree-one check node and deletes it together with the corresponding
variable node and all its incident edges. In this way a new residual graph results and a
new iteration starts. Decoding is successful if the whole graph gets deleted by this procedure.
In the opposite case, the decoder gets stuck in a stopping set.
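The peeling procedure just described can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `sample_regular_graph` and `peel`, the socket-matching construction of the regular ensemble, and the bookkeeping through running sums of neighbour indices are all assumptions made here. Degree-one checks are processed in stack order rather than at random; this does not change the final residual graph, which is always the largest stopping set contained in the erased positions.

```python
import random

def sample_regular_graph(n, l, k, rng):
    """Sample a Tanner graph with n variables of degree l and checks of degree k
    by randomly matching sockets (multi-edges may occur). Hypothetical helper."""
    assert (n * l) % k == 0, "n * l must be divisible by k"
    m = n * l // k
    var_sockets = [v for v in range(n) for _ in range(l)]
    chk_sockets = [c for c in range(m) for _ in range(k)]
    rng.shuffle(chk_sockets)
    return list(zip(var_sockets, chk_sockets)), m

def peel(m, edges, erased):
    """Run the peeling decoder on the residual graph induced by the erased
    variables; return the set of variables that remain undecoded."""
    erased = set(erased)
    var_checks = {v: [] for v in erased}  # residual edges of each erased variable
    chk_deg = [0] * m                     # residual degree of each check
    chk_sum = [0] * m                     # sum of residual neighbour indices
    for v, c in edges:
        if v in erased:
            var_checks[v].append(c)
            chk_deg[c] += 1
            chk_sum[c] += v
    stack = [c for c in range(m) if chk_deg[c] == 1]
    while stack:
        c = stack.pop()
        if chk_deg[c] != 1:
            continue                      # stale entry: the check lost its last edge
        v = chk_sum[c]                    # the unique remaining neighbour of c
        for c2 in var_checks.pop(v):      # delete v together with all its edges
            chk_deg[c2] -= 1
            chk_sum[c2] -= v
            if chk_deg[c2] == 1:
                stack.append(c2)
    return set(var_checks)                # empty set means successful decoding

if __name__ == "__main__":
    rng = random.Random(1)
    n, l, k, trials = 1200, 3, 6, 20
    for eps in (0.35, 0.45):
        failures = 0
        for _ in range(trials):
            edges, m = sample_regular_graph(n, l, k, rng)
            erased = {v for v in range(n) if rng.random() < eps}
            failures += bool(peel(m, edges, erased))
        print(f"eps={eps}: {failures}/{trials} block failures")
```

The triple $(v, s, t)$ used in the sequel can be read off at any iteration as the number of keys left in `var_checks`, the number of checks with residual degree one, and the number of checks with residual degree two or more.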

The state of the algorithm after a fixed number of iterations can be entirely described
by a finite set of integers, providing the number of variable and check nodes of a given
degree. Let us denote this vector of integers by $x$. For regular ensembles the situation is
even simpler: it is enough to specify the total number of variable nodes $v$, the number of
degree-one check nodes $s$, and the number of check nodes of higher degree $t$. Therefore, in
this case $x = (v, s, t)$. Notice that, when decoding starts, these variables are of order $\Theta(n)$.
Each iteration of the decoding procedure described above amounts to a finite increment
(or decrement) in these variables. In particular, for the regular case: $v$ decreases by
one; $s$ decreases by one (the check node chosen at that iteration) and increases by the
number of degree-two check nodes which are neighbors of the newly deleted variable; $t$
decreases by this last quantity. It is easy to realize that the probability distribution of
these increments (decrements) depends upon $v, s, t$ only on the scale $n$. More precisely,
the probability of a variation $(\Delta v, \Delta s, \Delta t)$ is (up to $1/n$ corrections) a smooth function
of $v/n$, $s/n$, and $t/n$.




Figure 2: A pictorial representation of density and covariance evolution for the
LDPC$(n, x^2, x^5)$ ensemble. Notice that the ellipsoids corresponding to $(s, t)$ covariances
should be regarded as living on a smaller (by a factor $\sqrt{n}$) scale than the typical
trajectory.

Call $x(\ell)$ the state after $\ell$ iterations, and consider the change in state $\delta x = x(\ell +
\delta\ell) - x(\ell)$ in a time $\delta\ell$. If $\delta\ell$ is much smaller than $n$, then also $|\delta x| \ll n$. Therefore
the steps in the interval $[\ell, \ell + \delta\ell]$ are, to a good approximation, independent and
identically distributed. If $\delta\ell$ is nevertheless much larger than 1, we can apply the central
limit theorem to deduce that $\delta x$ is, to a good approximation, a multi-dimensional gaussian
variable with mean of order $\delta\ell$ and standard deviation of order $\sqrt{\delta\ell}$. Since, for
$\delta\ell \gg 1$, $\sqrt{\delta\ell} \ll \delta\ell$, one can, to a first approximation, neglect fluctuations. This was
the essential step taken by Luby et al. These authors showed that $x(\ell)$ concentrates around
its average value $x_{\rm av}(\ell) \approx n z(\ell/n)$, with $z(\tau)$ the solution of a set of ordinary
differential equations:

$$\frac{dz}{d\tau} = f(z(\tau)). \qquad (3)$$

These are nothing but the density evolution equations. They are integrated with an
initial condition depending upon the erasure probability.

A typical decoding trajectory is reported in Fig. 2. For $\epsilon < \epsilon^*$, it reaches the $(v =
0, s = 0, t = 0)$ point before touching the $s = 0$ plane: decoding is successful. For $\epsilon > \epsilon^*$,
it touches the $s = 0$ plane before the $(v = 0, s = 0, t = 0)$ point: a stopping set has been
reached.
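The threshold separating these two behaviours can be located numerically. The sketch below does not integrate the differential equations (3); it uses the equivalent message-passing form of density evolution for a regular ensemble with left degree $l$ and right degree $k$, namely the recursion $x \mapsto \epsilon\,(1 - (1 - x)^{k-1})^{l-1}$ for the erasure fraction of variable-to-check messages. The function names and the grid search are illustrative assumptions.

```python
def de_step(x, eps, l, k):
    """One density-evolution iteration on the BEC for the (l, k)-regular ensemble."""
    return eps * (1.0 - (1.0 - x) ** (k - 1)) ** (l - 1)

def de_converges(eps, l, k, iters=10000, tol=1e-9):
    """True if the erasure fraction is driven below tol, i.e. decoding succeeds
    in the large-blocklength limit."""
    x = eps
    for _ in range(iters):
        x = de_step(x, eps, l, k)
        if x < tol:
            return True
    return False

def bec_threshold(l, k, grid=100000):
    """Largest eps for which de_step(x) < x for every x in (0, 1], computed as
    the minimum of x / (1 - (1 - x)^(k-1))^(l-1) over a uniform grid."""
    best = float("inf")
    for i in range(1, grid + 1):
        x = i / grid
        best = min(best, x / (1.0 - (1.0 - x) ** (k - 1)) ** (l - 1))
    return best

if __name__ == "__main__":
    print(f"(3,6) threshold: {bec_threshold(3, 6):.5f}")
    print("eps = 0.40 decodes:", de_converges(0.40, 3, 6))
    print("eps = 0.45 decodes:", de_converges(0.45, 3, 6))
```

For $l = 3$, $k = 6$ the grid search reproduces the value $\epsilon^* \approx 0.42944$ quoted in the caption of Fig. 4.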

Once the typical trajectory is found, one can compute the distribution of $x(\ell) - x_{\rm av}(\ell)$.
The procedure is conceptually simple. Consider $\ell \approx n\tau$ for some fixed $\tau$ and decompose
the interval $[0, \ell]$ into sub-intervals of size $\delta\ell$, with $1 \ll \delta\ell \ll n$. Within each sub-interval
we can apply the argument outlined above to show that $\delta x$ is gaussian with standard
deviation of order $\sqrt{\delta\ell}$. Gluing together $n/\delta\ell$ such intervals we deduce that $x(\ell) - x_{\rm av}(\ell)$
is gaussian with standard deviation of order $\sqrt{n}$. This has immediate implications for
the decoding performance. In fact, as soon as the average trajectory passes within a
distance of order $\sqrt{n}$ above the $s = 0$ plane, fluctuations will produce a finite probability
of hitting the plane. Vice versa, if the average trajectory passes a distance of order $\sqrt{n}$
below the $s = 0$ plane, decoding can nevertheless be successful with finite probability. Recall that the
erasure probability sets the initial condition. It is easy to realize that a change $\delta\epsilon$ in the
channel parameter implies a change of order $n\,\delta\epsilon$ in the average trajectory $x_{\rm av}(\ell)$. At
threshold ($\epsilon = \epsilon^*$), $x_{\rm av}(\ell)$ is just tangent to the $s = 0$ plane. This implies that the failure
probability is strictly between 0 and 1 as long as $n|\epsilon - \epsilon^*| \lesssim \sqrt{n}$. This suggests that the
scaling form (1) holds with $\nu = 2$.

The above argument can be made both quantitative and rigorous: the procedure for
computing gaussian fluctuations around the typical trajectory has been named covariance
evolution [3] and it is not much harder than usual density evolution. We refer to the next
section for a more precise account of the results. Unfortunately, when compared with detailed
simulations, or with exact computations obtained from recursion relations, the results are
not very accurate. The reason lies in a subtle effect that produces sizeable corrections
of the form (2). It turns out that $\omega = 1/6$: this implies large corrections even for quite
large blocklengths (in the range $10^3$ to $10^5$). Since it turns out that $g(z) \propto f'(z)$, the
same corrections can be attributed to a finite-length shift of the iterative threshold:

$$\epsilon^*(n) = \epsilon^* - \epsilon_1^* \, n^{-2/3} + \Theta(n^{-1}), \qquad (4)$$

with a positive (ensemble-dependent) constant $\epsilon_1^*$.

Figure 3: A pictorial view of decoding trajectories near the critical point. The type of
trajectory depicted here is responsible for the finite-length shift of the iterative threshold
(4).

It is not hard to understand the origin of the shift (4) heuristically. In a nutshell:
when passing a distance of order $\sqrt{n}$ above the $s = 0$ plane, the decoding trajectory has
many occasions to fail. At any time, a fluctuation may cause it to touch the plane. On the other hand, for
decoding to be successful, all fluctuations must be lucky enough to keep the trajectory
away from the plane. This asymmetry leads to a finite-size lowering of the threshold.

In order to understand the $-2/3$ exponent in Eq. (4), it may be convenient to consider
a toy example, cf. Fig. 3. Here the state is described by a single integer $x$, playing the role
of $s$ in the decoding problem, evolving over discrete time $\ell$. Both $x$ and $\ell$ are typically of
order $n$, and the increment probabilities of $x$ depend smoothly on $x/n$ and $\ell/n$. Finally,
the average trajectory $x_{\rm av}(\ell)$ has a minimum at $\ell = \ell^*$ with $x_{\rm av}(\ell^*) = 0$, and

$$x_{\rm av}(\ell) \simeq \frac{1}{2n}(\ell - \ell^*)^2 \qquad (5)$$

near the minimum. In other words the minimum is 'non-degenerate'. This is the situation
for iterative decoding at $\epsilon = \epsilon^*$ under mild conditions on the code ensemble. We want to
compute the 'failure' probability $P(n)$, i.e. the probability for the trajectory to touch the
$x = 0$ plane, which corresponds to the block error probability in the decoding problem.

Within a first approximation $x(\ell)$ is at any time a gaussian variable with mean $x_{\rm av}(\ell)$
and standard deviation of order $\sqrt{n}$. The failure probability can be estimated as the
probability for $x(\ell = \ell^*) < 0$. Since $x_{\rm av}(\ell = \ell^*) = 0$, we get $P(n) = 1/2$. The crucial
point is now that, even if $x(\ell^*) > 0$, the trajectory has some probability of touching the
$x = 0$ plane either for $\ell < \ell^*$ or for $\ell > \ell^*$. What matters is clearly the location of the
minimum $\ell_g$. The trajectory does not touch the $x = 0$ plane if and only if $x(\ell_g) > 0$.

Let us now try to estimate the position of the minimum $\ell_g$. This is the outcome of a
balance between two competing forces. On the one hand, the average trajectory is bent
upward and forces $\ell_g$ to be close to $\ell^*$. This yields a contribution to $x(\ell) - x(\ell^*)$ which is
of order $(\ell - \ell^*)^2/2n$. On the other hand, by moving away from $\ell^*$ one can take advantage
of fluctuations, $[x(\ell) - x(\ell^*)] - [x_{\rm av}(\ell) - x_{\rm av}(\ell^*)]$, which are typically of order $\sqrt{|\ell - \ell^*|}$.
The location of $\ell_g$ is estimated by balancing these two effects: $(\ell - \ell^*)^2/2n \sim \sqrt{|\ell - \ell^*|}$.
We get therefore

$$|\ell_g - \ell^*| = \Theta(n^{2/3}), \qquad |x(\ell_g) - x(\ell^*)| = \Theta(n^{1/3}). \qquad (6)$$
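The exponent arithmetic behind Eq. (6) can be checked numerically. The sketch below simply solves the balance condition $d^2/2n = \sqrt{d}$ for the displacement $d = |\ell - \ell^*|$ in closed form; all order-one constants are set to one, which is an assumption made for illustration only, and the function names are hypothetical.

```python
def balance_point(n):
    """Displacement d at which the drift d^2 / (2n) equals the fluctuation sqrt(d).
    Solving d^(3/2) = 2n gives d = (2n)^(2/3). Order-one prefactors are ignored."""
    d = (2.0 * n) ** (2.0 / 3.0)
    drift = d * d / (2.0 * n)   # deterministic bending of the average trajectory
    fluct = d ** 0.5            # typical size of the random excursion
    return d, drift, fluct

def excess_failure_scale(n):
    """Probability scale for a gaussian of width sqrt(n) to fall within a window
    of width n^(1/3): n^(-1/2) * n^(1/3) = n^(-1/6)."""
    return n ** (-0.5) * n ** (1.0 / 3.0)

if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        d, drift, fluct = balance_point(n)
        print(f"n={n}: d={d:.1f}, drift={drift:.2f}, fluct={fluct:.2f}, "
              f"excess scale={excess_failure_scale(n):.3f}")
```

Multiplying $n$ by 8 multiplies the displacement by $8^{2/3} = 4$ and the excursion size by $8^{1/3} = 2$, in agreement with the exponents $2/3$ and $1/3$. The slow decay of the $n^{-1/6}$ scale over the printed range illustrates why the correction remains sizeable at practical blocklengths.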

The above argument implies that the failure probability is slightly larger than 1/2.
One can in fact fail either if $x(\ell^*) < 0$ (and this happens with probability 1/2), or if
$x(\ell^*) = \Theta(n^{1/3})$ because, in this case, $x(\ell_g)$ can be $\Theta(n^{1/3})$ below $x(\ell^*)$. What is the
probability of $x(\ell^*) = \Theta(n^{1/3})$? We know that $x(\ell^*)$ is, to a good approximation, a
gaussian variable with mean 0 and standard deviation $\Theta(\sqrt{n})$. Therefore the probability
is of order $n^{-1/2} \cdot n^{1/3} = n^{-1/6}$. We find therefore the estimate

$$P(n) = \frac{1}{2} + P_1 \, n^{-1/6} + \dots \qquad (7)$$

Remarkably, the constant $P_1$ can be calculated exactly and depends only upon
the transition probabilities near $\ell^*$. Once the axes $x$ and $\ell$ have been properly scaled,
$P_1$ is given by an integral expression in terms of Airy functions.

When adapted to the coding problem, the above argument yields $\omega = 1/6$ in Eq. (2).
If we define the finite-size threshold $\epsilon^*(n)$ by $P_B(n, \epsilon^*(n)) = 1/2$, we get Eq. (4).

3 Results 

The heuristic arguments presented in the previous Section can be put on a precise
quantitative basis. For some of them we were able to provide a rigorous foundation, leading to
the following:

Lemma 1 [Scaling of Unconditionally Stable Ensembles] Consider transmission over the
BEC($\epsilon$) using random elements from an ensemble LDPC$(n, \lambda, \rho)$ which has a single critical
point and is unconditionally stable. Let $\epsilon^* = \epsilon^*(\lambda, \rho)$ denote the threshold and let $\nu^*$
denote the fractional size of the residual graph at the critical point corresponding to the
threshold. Fix $z$ to be $z := \sqrt{n}(\epsilon^* - \epsilon)$. Let $P_b(n, \lambda, \rho, \epsilon)$ denote the expected bit erasure
probability and let $P_{B,\gamma}(n, \lambda, \rho, \epsilon)$ denote the expected block erasure probability due to
errors of size at least $\gamma\nu^*$, where $\gamma \in (0, 1)$. Then as $n$ tends to infinity,

$$P_{B,\gamma}(n, \lambda, \rho, \epsilon) = Q(z/\alpha)\,(1 + o_n(1)), \qquad (8)$$
$$P_b(n, \lambda, \rho, \epsilon) = \nu^* \, Q(z/\alpha)\,(1 + o_n(1)), \qquad (9)$$

where $\alpha = \alpha(\lambda, \rho)$ is a constant which depends on the ensemble.

Table 1: Thresholds and scaling parameters for some regular standard ensembles $\lambda(x) =
x^{l-1}$, $\rho(x) = x^{k-1}$. The shift parameter is given as $\beta/\Omega$, where $\Omega$ is the universal constant
defined in Ref. [3] in terms of Airy functions, and whose numerical value is very close to
1.

Unfortunately, we were not able to derive estimates powerful enough to make the 'shift'
argument rigorous. However, the difficulty is more technical than conceptual. We
therefore formulate the following:

Conjecture 1 [Refined Scaling of Unconditionally Stable Ensembles] Consider
transmission over the BEC($\epsilon$) using random elements from an ensemble LDPC$(n, \lambda, \rho)$ which
has a single critical point and is unconditionally stable. Let $\epsilon^* = \epsilon^*(\lambda, \rho)$ denote the
threshold and let $\nu^*$ denote the fractional size of the residual graph at the threshold. Let
$P_b(n, \lambda, \rho, \epsilon)$ denote the expected bit erasure probability and let $P_{B,\gamma}(n, \lambda, \rho, \epsilon)$ denote
the expected block erasure probability due to errors of size at least $\gamma\nu^*$, where $\gamma \in (0, 1)$.
Fix $z$ to be $z := \sqrt{n}(\epsilon^* - \beta n^{-2/3} - \epsilon)$. Then as $n$ tends to infinity,

$$P_{B,\gamma}(n, \lambda, \rho, \epsilon) = Q(z/\alpha)\,(1 + O(n^{-1/3})), \qquad (10)$$
$$P_b(n, \lambda, \rho, \epsilon) = \nu^* \, Q(z/\alpha)\,(1 + O(n^{-1/3})), \qquad (11)$$

where $\alpha = \alpha(\lambda, \rho)$ and $\beta = \beta(\lambda, \rho)$ are constants which depend on the ensemble.
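The leading term of Eq. (10) is straightforward to evaluate. The following sketch is a hedged illustration rather than the authors' code: it assumes that $Q$ is the standard gaussian tail function $Q(x) = \frac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$ (the text does not define it explicitly), it drops the $O(n^{-1/3})$ correction, and it uses the parameter values quoted in the caption of Fig. 4 for the ensemble with $\epsilon^* \approx 0.42944$. The function names are hypothetical.

```python
import math

def q_function(x):
    """Standard gaussian tail probability (assumed meaning of Q in the text)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Parameter values as quoted in the caption of Fig. 4.
EPS_STAR = 0.42944
ALPHA = math.sqrt(0.249869 ** 2 + EPS_STAR * (1.0 - EPS_STAR))
BETA = 0.616045

def finite_length_threshold(n, eps_star=EPS_STAR, beta=BETA):
    """Channel parameter at which the leading term of Eq. (10) equals 1/2."""
    return eps_star - beta * n ** (-2.0 / 3.0)

def block_error_refined(n, eps, eps_star=EPS_STAR, alpha=ALPHA, beta=BETA):
    """Leading term of the refined scaling law: Q(z / alpha) with
    z = sqrt(n) * (eps_star - beta * n^(-2/3) - eps)."""
    z = math.sqrt(n) * (eps_star - beta * n ** (-2.0 / 3.0) - eps)
    return q_function(z / alpha)

if __name__ == "__main__":
    for n in (1024, 2048, 4096, 8192):
        row = ", ".join(f"{block_error_refined(n, e):.2e}" for e in (0.36, 0.38, 0.40, 0.42))
        print(f"n={n}: eps*(n)={finite_length_threshold(n):.5f}, P_B at eps=0.36..0.42: {row}")
```

Setting `beta=0.0` recovers the unshifted form of Lemma 1, so the two predictions can be compared directly at a given blocklength.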

In Table 1 we report the values of the scaling parameters for a few regular ensembles.
These parameters are obtained by integrating a set of ordinary differential (covariance
evolution) equations and are easily pushed to great precision. Finally, in Fig. 4 we
compare the refined scaling form provided by Conjecture 1 with the exact error probability
computed by recursion. The agreement is excellent.

Figure 4: Scaling of $P_B(n, \epsilon)$ for transmission over BEC($\epsilon$) and belief propagation
decoding. The threshold for this combination is $\epsilon^* \approx 0.42944$, see Table 1. The
blocklengths/expurgation parameters are $n/s$ = 1024/24, 2048/43, 4096/82 and 8192/147,
respectively. (More precisely, we assume that the ensembles have been expurgated so
that graphs in this ensemble do not contain stopping sets of size $s$ or smaller.) The
solid curves represent the exact ensemble averages. The dashed curves are computed
according to the refined scaling law stated in Conjecture 1 with scaling parameters
$\alpha = \sqrt{0.249869^2 + \epsilon^*(1 - \epsilon^*)}$ and $\beta = 0.616045$, see Table 1.